Dataset statistics
| Number of variables | 16 |
|---|---|
| Number of observations | 978949 |
| Missing cells | 170252 |
| Missing cells (%) | 1.1% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 119.5 MiB |
| Average record size in memory | 128.0 B |
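The two derived rows in this table can be checked from the raw counts. The following is a minimal sketch using only the figures shown above; the variable names are illustrative and are not part of the report.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 16
n_observations = 978_949
missing_cells = 170_252
total_size_mib = 119.5

# Share of missing cells = missing cells / (rows * columns).
missing_pct = 100 * missing_cells / (n_observations * n_variables)

# Average record size = total bytes / rows (1 MiB = 1024 * 1024 bytes).
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

An average of about 128 B per record would be consistent with eight bytes per value across the 16 columns, although the report itself does not state how memory is allocated.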
Variable types
| Numeric | 12 |
|---|---|
| Categorical | 4 |
Alerts
| Alert | Type |
|---|---|
| time has a high cardinality: 40240 distinct values | High cardinality |
| gameId is highly correlated with team | High correlation |
| frameId is highly correlated with s and 1 other fields | High correlation |
| s is highly correlated with dis | High correlation |
| dis is highly correlated with s | High correlation |
| team is highly correlated with gameId | High correlation |
| nflId has 42563 (4.3%) missing values | Missing |
| jerseyNumber has 42563 (4.3%) missing values | Missing |
| o has 42563 (4.3%) missing values | Missing |
| dir has 42563 (4.3%) missing values | Missing |
| s has 60657 (6.2%) zeros | Zeros |
| a has 56618 (5.8%) zeros | Zeros |
| dis has 62790 (6.4%) zeros | Zeros |
Reproduction
| Analysis started | 2022-11-02 14:58:25.170610 |
|---|---|
| Analysis finished | 2022-11-02 14:59:54.147409 |
| Duration | 1 minute and 28.98 seconds |
| Software version | pandas-profiling v3.4.0 |
| Download configuration | config.json |
gameId
Real number (ℝ≥0)
| Distinct | 15 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2021103636 |
| Minimum | 2021102800 |
|---|---|
| Maximum | 2021110100 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.5 MiB |
Quantile statistics
| Minimum | 2021102800 |
|---|---|
| 5-th percentile | 2021102800 |
| Q1 | 2021103103 |
| median | 2021103107 |
| Q3 | 2021103110 |
| 95-th percentile | 2021110100 |
| Maximum | 2021110100 |
| Range | 7300 |
| Interquartile range (IQR) | 7 |
Descriptive statistics
| Standard deviation | 1880.878366 |
|---|---|
| Coefficient of variation (CV) | 9.306194561 × 10⁻⁷ |
| Kurtosis | 7.884864592 |
| Mean | 2021103636 |
| Median Absolute Deviation (MAD) | 4 |
| Skewness | 3.140901101 |
| Sum | 1.978557383 × 10¹⁵ |
| Variance | 3537703.428 |
| Monotonicity | Increasing |
| Value | Count | Frequency (%) |
| 2021103106 | 82363 | 8.4% |
| 2021103112 | 78752 | 8.0% |
| 2021110100 | 76314 | 7.8% |
| 2021103109 | 72197 | 7.4% |
| 2021103108 | 72013 | 7.4% |
| 2021103111 | 70771 | 7.2% |
| 2021103107 | 69483 | 7.1% |
| 2021103101 | 64285 | 6.6% |
| 2021103110 | 61502 | 6.3% |
| 2021103105 | 61088 | 6.2% |
| Other values (5) | 270181 |
| Value | Count | Frequency (%) |
| 2021102800 | 51060 | |
| 2021103100 | 50462 | |
| 2021103101 | 64285 | |
| 2021103102 | 58075 | |
| 2021103103 | 54901 | |
| 2021103104 | 55683 | |
| 2021103105 | 61088 | |
| 2021103106 | 82363 | |
| 2021103107 | 69483 | |
| 2021103108 | 72013 |
| Value | Count | Frequency (%) |
| 2021110100 | 76314 | |
| 2021103112 | 78752 | |
| 2021103111 | 70771 | |
| 2021103110 | 61502 | |
| 2021103109 | 72197 | |
| 2021103108 | 72013 | |
| 2021103107 | 69483 | |
| 2021103106 | 82363 | |
| 2021103105 | 61088 | |
| 2021103104 | 55683 |
playId
Real number (ℝ≥0)
| Distinct | 913 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2172.301201 |
| Minimum | 54 |
|---|---|
| Maximum | 4750 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.5 MiB |
Quantile statistics
| Minimum | 54 |
|---|---|
| 5-th percentile | 230 |
| Q1 | 1144 |
| median | 2120 |
| Q3 | 3259 |
| 95-th percentile | 4109 |
| Maximum | 4750 |
| Range | 4696 |
| Interquartile range (IQR) | 2115 |
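The Range and Interquartile range rows are differences of other rows in the same table. A small sketch, using the playId quantiles as reported (the variable names are illustrative):

```python
# Quantile figures copied from the playId "Quantile statistics" table.
minimum, q1, median, q3, maximum = 54, 1144, 2120, 3259, 4750

value_range = maximum - minimum  # spread between the two extremes
iqr = q3 - q1                    # spread of the middle 50% of observations

print(value_range, iqr)
```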
Descriptive statistics
| Standard deviation | 1249.681091 |
|---|---|
| Coefficient of variation (CV) | 0.5752798417 |
| Kurtosis | -1.151286371 |
| Mean | 2172.301201 |
| Median Absolute Deviation (MAD) | 1073 |
| Skewness | 0.04065043933 |
| Sum | 2126572088 |
| Variance | 1561702.829 |
| Monotonicity | Not monotonic |
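Two of the descriptive statistics for playId can be recomputed from the mean and standard deviation alone. This sketch assumes the usual definitions (coefficient of variation = standard deviation / mean, variance = standard deviation squared); the inputs are the rounded figures from the table, so agreement is only expected to roughly the printed precision.

```python
import math

# Figures copied from the playId "Descriptive statistics" table.
mean = 2172.301201
std = 1249.681091

cv = std / mean       # coefficient of variation (unitless)
variance = std ** 2   # variance is the squared standard deviation

print(f"CV = {cv:.6f}")
print(f"Variance = {variance:.3f}")

# Compare against the values the report prints.
print(math.isclose(cv, 0.5752798417, rel_tol=1e-6))
print(math.isclose(variance, 1561702.829, rel_tol=1e-6))
```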
| Value | Count | Frequency (%) |
| 2954 | 5152 | 0.5% |
| 2120 | 4393 | 0.4% |
| 1182 | 3335 | 0.3% |
| 3617 | 3289 | 0.3% |
| 62 | 2898 | 0.3% |
| 3948 | 2829 | 0.3% |
| 765 | 2806 | 0.3% |
| 189 | 2806 | 0.3% |
| 3279 | 2783 | 0.3% |
| 1970 | 2691 | 0.3% |
| Other values (903) | 945967 |
| Value | Count | Frequency (%) |
| 54 | 897 | 0.1% |
| 55 | 2070 | |
| 62 | 2898 | |
| 75 | 1564 | |
| 76 | 713 | 0.1% |
| 79 | 1886 | |
| 84 | 759 | 0.1% |
| 86 | 690 | 0.1% |
| 97 | 782 | 0.1% |
| 99 | 1840 |
| Value | Count | Frequency (%) |
| 4750 | 713 | |
| 4728 | 690 | |
| 4670 | 1081 | |
| 4625 | 759 | |
| 4603 | 966 | |
| 4561 | 1058 | |
| 4518 | 989 | |
| 4515 | 1242 | |
| 4489 | 1173 | |
| 4478 | 851 |
nflId
Real number (ℝ≥0)
| Distinct | 1087 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 42563 |
| Missing (%) | 4.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 45886.90232 |
| Minimum | 25511 |
|---|---|
| Maximum | 54038 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.5 MiB |
Quantile statistics
| Minimum | 25511 |
|---|---|
| 5-th percentile | 38538 |
| Q1 | 42471 |
| median | 45785 |
| Q3 | 48455 |
| 95-th percentile | 53502 |
| Maximum | 54038 |
| Range | 28527 |
| Interquartile range (IQR) | 5984 |
Descriptive statistics
| Standard deviation | 5020.932834 |
|---|---|
| Coefficient of variation (CV) | 0.1094197381 |
| Kurtosis | -0.06878397042 |
| Mean | 45886.90232 |
| Median Absolute Deviation (MAD) | 3242 |
| Skewness | -0.1922299703 |
| Sum | 4.296785291 × 10¹⁰ |
| Variance | 25209766.53 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 46106 | 2124 | 0.2% |
| 43353 | 2124 | 0.2% |
| 38569 | 2124 | 0.2% |
| 43291 | 2124 | 0.2% |
| 43307 | 2124 | 0.2% |
| 39947 | 2124 | 0.2% |
| 52442 | 2124 | 0.2% |
| 47971 | 2124 | 0.2% |
| 46110 | 2124 | 0.2% |
| 46075 | 2124 | 0.2% |
| Other values (1077) | 915146 | |
| (Missing) | 42563 | 4.3% |
| Value | Count | Frequency (%) |
| 25511 | 1440 | |
| 28963 | 1095 | |
| 29550 | 1517 | |
| 29851 | 953 | |
| 30842 | 422 | < 0.1% |
| 33084 | 1248 | |
| 33107 | 1181 | |
| 33241 | 75 | < 0.1% |
| 33566 | 1161 | |
| 34452 | 1042 |
| Value | Count | Frequency (%) |
| 54038 | 148 | < 0.1% |
| 53994 | 100 | < 0.1% |
| 53960 | 127 | < 0.1% |
| 53959 | 243 | < 0.1% |
| 53954 | 100 | < 0.1% |
| 53953 | 695 | |
| 53946 | 276 | < 0.1% |
| 53921 | 82 | < 0.1% |
| 53910 | 313 | < 0.1% |
| 53900 | 923 |
frameId
Real number (ℝ≥0)
| Distinct | 97 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 22.51631699 |
| Minimum | 1 |
|---|---|
| Maximum | 97 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.5 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 3 |
| Q1 | 11 |
| median | 21 |
| Q3 | 32 |
| 95-th percentile | 48 |
| Maximum | 97 |
| Range | 96 |
| Interquartile range (IQR) | 21 |
Descriptive statistics
| Standard deviation | 14.53702383 |
|---|---|
| Coefficient of variation (CV) | 0.6456217434 |
| Kurtosis | 0.5457487922 |
| Mean | 22.51631699 |
| Median Absolute Deviation (MAD) | 10 |
| Skewness | 0.726804708 |
| Sum | 22042326 |
| Variance | 211.3250618 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 23736 | 2.4% |
| 12 | 23736 | 2.4% |
| 21 | 23736 | 2.4% |
| 20 | 23736 | 2.4% |
| 19 | 23736 | 2.4% |
| 18 | 23736 | 2.4% |
| 17 | 23736 | 2.4% |
| 16 | 23736 | 2.4% |
| 15 | 23736 | 2.4% |
| 14 | 23736 | 2.4% |
| Other values (87) | 741589 |
| Value | Count | Frequency (%) |
| 1 | 23736 | |
| 2 | 23736 | |
| 3 | 23736 | |
| 4 | 23736 | |
| 5 | 23736 | |
| 6 | 23736 | |
| 7 | 23736 | |
| 8 | 23736 | |
| 9 | 23736 | |
| 10 | 23736 |
| Value | Count | Frequency (%) |
| 97 | 23 | < 0.1% |
| 96 | 23 | < 0.1% |
| 95 | 23 | < 0.1% |
| 94 | 23 | < 0.1% |
| 93 | 46 | |
| 92 | 46 | |
| 91 | 69 | |
| 90 | 69 | |
| 89 | 69 | |
| 88 | 69 |
time
Categorical
| Distinct | 40240 |
|---|---|
| Distinct (%) | 4.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.5 MiB |
| 2021-10-31T19:31:54.400 | 69 |
|---|---|
| 2021-10-31T17:28:05.000 | 69 |
| 2021-10-31T17:28:04.100 | 69 |
| 2021-10-31T17:28:04.200 | 69 |
| 2021-10-31T17:28:04.300 | 69 |
| Other values (40235) |
Length
| Max length | 23 |
|---|---|
| Median length | 23 |
| Mean length | 23 |
| Min length | 23 |
Characters and Unicode
| Total characters | 22515827 |
|---|---|
| Distinct characters | 14 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2021-10-29T00:27:23.000 |
|---|---|
| 2nd row | 2021-10-29T00:27:23.100 |
| 3rd row | 2021-10-29T00:27:23.200 |
| 4th row | 2021-10-29T00:27:23.300 |
| 5th row | 2021-10-29T00:27:23.400 |
Common Values
| Value | Count | Frequency (%) |
| 2021-10-31T19:31:54.400 | 69 | < 0.1% |
| 2021-10-31T17:28:05.000 | 69 | < 0.1% |
| 2021-10-31T17:28:04.100 | 69 | < 0.1% |
| 2021-10-31T17:28:04.200 | 69 | < 0.1% |
| 2021-10-31T17:28:04.300 | 69 | < 0.1% |
| 2021-10-31T17:28:04.400 | 69 | < 0.1% |
| 2021-10-31T17:28:04.500 | 69 | < 0.1% |
| 2021-10-31T17:28:04.600 | 69 | < 0.1% |
| 2021-10-31T17:28:04.700 | 69 | < 0.1% |
| 2021-10-31T17:28:04.800 | 69 | < 0.1% |
| Other values (40230) | 978259 |
Length
| Value | Count | Frequency (%) |
| 2021-10-31t19:31:54.400 | 69 | < 0.1% |
| 2021-10-31t19:31:54.700 | 69 | < 0.1% |
| 2021-10-31t18:58:16.700 | 69 | < 0.1% |
| 2021-10-31t18:58:16.600 | 69 | < 0.1% |
| 2021-10-31t18:58:16.500 | 69 | < 0.1% |
| 2021-10-31t18:58:16.400 | 69 | < 0.1% |
| 2021-10-31t18:58:16.300 | 69 | < 0.1% |
| 2021-10-31t19:28:46.800 | 69 | < 0.1% |
| 2021-10-31t18:05:12.000 | 69 | < 0.1% |
| 2021-10-31t18:58:16.100 | 69 | < 0.1% |
| Other values (40230) | 978259 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 4896907 | |
| 1 | 4177306 | |
| 2 | 3180808 | |
| - | 1957898 | 8.7% |
| : | 1957898 | 8.7% |
| 3 | 1457717 | 6.5% |
| T | 978949 | 4.3% |
| . | 978949 | 4.3% |
| 4 | 603612 | 2.7% |
| 5 | 602830 | 2.7% |
| Other values (4) | 1722953 | 7.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 16642133 | |
| Other Punctuation | 2936847 | 13.0% |
| Dash Punctuation | 1957898 | 8.7% |
| Uppercase Letter | 978949 | 4.3% |
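The category totals above can be reproduced from a single timestamp. The sketch below uses Python's standard `unicodedata` module, which returns two-letter general-category codes (`Nd`, `Po`, `Pd`, `Lu`) rather than the long names the report prints; it assumes every value shares the same 23-character layout, which matches the Length table (minimum, median, and maximum length are all 23).

```python
import unicodedata
from collections import Counter

# One value taken from the Common Values table for this variable.
sample = "2021-10-31T19:31:54.400"
n_rows = 978_949  # number of observations in the dataset

# Count characters per Unicode general category in the sample string.
category_counts = Counter(unicodedata.category(ch) for ch in sample)

# Scale up to the whole column, assuming an identical layout in every row.
totals = {cat: n * n_rows for cat, n in category_counts.items()}

print(len(sample))
print(dict(category_counts))
print(totals)
```

Under that assumption the scaled counts equal the report's figures: 17 decimal digits, 3 other-punctuation characters, 2 dashes, and 1 uppercase letter per row.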
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 4896907 | |
| 1 | 4177306 | |
| 2 | 3180808 | |
| 3 | 1457717 | 8.8% |
| 4 | 603612 | 3.6% |
| 5 | 602830 | 3.6% |
| 9 | 515729 | 3.1% |
| 7 | 459379 | 2.8% |
| 8 | 448661 | 2.7% |
| 6 | 299184 | 1.8% |
Other Punctuation
| Value | Count | Frequency (%) |
| : | 1957898 | |
| . | 978949 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 1957898 |
Uppercase Letter
| Value | Count | Frequency (%) |
| T | 978949 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 21536878 | |
| Latin | 978949 | 4.3% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 4896907 | |
| 1 | 4177306 | |
| 2 | 3180808 | |
| - | 1957898 | 9.1% |
| : | 1957898 | 9.1% |
| 3 | 1457717 | 6.8% |
| . | 978949 | 4.5% |
| 4 | 603612 | 2.8% |
| 5 | 602830 | 2.8% |
| 9 | 515729 | 2.4% |
| Other values (3) | 1207224 | 5.6% |
Latin
| Value | Count | Frequency (%) |
| T | 978949 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 22515827 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 4896907 | |
| 1 | 4177306 | |
| 2 | 3180808 | |
| - | 1957898 | 8.7% |
| : | 1957898 | 8.7% |
| 3 | 1457717 | 6.5% |
| T | 978949 | 4.3% |
| . | 978949 | 4.3% |
| 4 | 603612 | 2.7% |
| 5 | 602830 | 2.7% |
| Other values (4) | 1722953 | 7.7% |
jerseyNumber
Real number (ℝ≥0)
| Distinct | 98 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 42563 |
| Missing (%) | 4.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 50.15307149 |
| Minimum | 1 |
|---|---|
| Maximum | 99 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.5 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 6 |
| Q1 | 23 |
| median | 52 |
| Q3 | 76 |
| 95-th percentile | 96 |
| Maximum | 99 |
| Range | 98 |
| Interquartile range (IQR) | 53 |
Descriptive statistics
| Standard deviation | 29.85567387 |
|---|---|
| Coefficient of variation (CV) | 0.5952910357 |
| Kurtosis | -1.34391869 |
| Mean | 50.15307149 |
| Median Absolute Deviation (MAD) | 27 |
| Skewness | 0.02970864493 |
| Sum | 46962634 |
| Variance | 891.3612624 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 22 | 19451 | 2.0% |
| 24 | 18736 | 1.9% |
| 21 | 18678 | 1.9% |
| 11 | 18422 | 1.9% |
| 97 | 17746 | 1.8% |
| 23 | 16895 | 1.7% |
| 72 | 16347 | 1.7% |
| 71 | 16153 | 1.7% |
| 20 | 15904 | 1.6% |
| 26 | 15626 | 1.6% |
| Other values (88) | 762428 | |
| (Missing) | 42563 | 4.3% |
| Value | Count | Frequency (%) |
| 1 | 14516 | |
| 2 | 13607 | |
| 3 | 2422 | 0.2% |
| 4 | 7823 | |
| 5 | 5529 | 0.6% |
| 6 | 6044 | 0.6% |
| 7 | 7516 | |
| 8 | 8104 | |
| 9 | 6009 | 0.6% |
| 10 | 15113 |
| Value | Count | Frequency (%) |
| 99 | 11078 | |
| 98 | 12730 | |
| 97 | 17746 | |
| 96 | 8984 | |
| 95 | 10028 | |
| 94 | 14303 | |
| 93 | 10202 | |
| 92 | 7397 | |
| 91 | 14280 | |
| 90 | 14794 |
team
Categorical
| Distinct | 31 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.5 MiB |
| football | 42563 |
|---|---|
| TEN | 39391 |
| IND | 39391 |
| DAL | 37664 |
| MIN | 37664 |
| Other values (26) |
Length
| Max length | 8 |
|---|---|
| Median length | 3 |
| Mean length | 2.99261555 |
| Min length | 2 |
Characters and Unicode
| Total characters | 2929618 |
|---|---|
| Distinct characters | 29 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | ARI |
|---|---|
| 2nd row | ARI |
| 3rd row | ARI |
| 4th row | ARI |
| 5th row | ARI |
Common Values
| Value | Count | Frequency (%) |
| football | 42563 | 4.3% |
| TEN | 39391 | 4.0% |
| IND | 39391 | 4.0% |
| DAL | 37664 | 3.8% |
| MIN | 37664 | 3.8% |
| KC | 36498 | 3.7% |
| NYG | 36498 | 3.7% |
| SEA | 34529 | 3.5% |
| JAX | 34529 | 3.5% |
| LAC | 34441 | 3.5% |
| Other values (21) | 605781 |
Length
| Value | Count | Frequency (%) |
| football | 42563 | 4.3% |
| ind | 39391 | 4.0% |
| ten | 39391 | 4.0% |
| dal | 37664 | 3.8% |
| min | 37664 | 3.8% |
| kc | 36498 | 3.7% |
| nyg | 36498 | 3.7% |
| sea | 34529 | 3.5% |
| jax | 34529 | 3.5% |
| lac | 34441 | 3.5% |
| Other values (21) | 605781 |
Most occurring characters
| Value | Count | Frequency (%) |
| N | 317108 | 10.8% |
| A | 303226 | 10.4% |
| I | 246114 | 8.4% |
| E | 190663 | 6.5% |
| C | 182336 | 6.2% |
| L | 151712 | 5.2% |
| T | 150260 | 5.1% |
| D | 133100 | 4.5% |
| S | 91718 | 3.1% |
| B | 89012 | 3.0% |
| Other values (19) | 1074369 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 2589114 | |
| Lowercase Letter | 340504 | 11.6% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| N | 317108 | |
| A | 303226 | |
| I | 246114 | 9.5% |
| E | 190663 | 7.4% |
| C | 182336 | 7.0% |
| L | 151712 | 5.9% |
| T | 150260 | 5.8% |
| D | 133100 | 5.1% |
| S | 91718 | 3.5% |
| B | 89012 | 3.4% |
| Other values (13) | 733865 |
Lowercase Letter
| Value | Count | Frequency (%) |
| l | 85126 | |
| o | 85126 | |
| f | 42563 | |
| a | 42563 | |
| b | 42563 | |
| t | 42563 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 2929618 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| N | 317108 | 10.8% |
| A | 303226 | 10.4% |
| I | 246114 | 8.4% |
| E | 190663 | 6.5% |
| C | 182336 | 6.2% |
| L | 151712 | 5.2% |
| T | 150260 | 5.1% |
| D | 133100 | 4.5% |
| S | 91718 | 3.1% |
| B | 89012 | 3.0% |
| Other values (19) | 1074369 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2929618 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| N | 317108 | 10.8% |
| A | 303226 | 10.4% |
| I | 246114 | 8.4% |
| E | 190663 | 6.5% |
| C | 182336 | 6.2% |
| L | 151712 | 5.2% |
| T | 150260 | 5.1% |
| D | 133100 | 4.5% |
| S | 91718 | 3.1% |
| B | 89012 | 3.0% |
| Other values (19) | 1074369 |
playDirection
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.5 MiB |
Length
| Max length | 5 |
|---|---|
| Median length | 4 |
| Mean length | 4.471418838 |
| Min length | 4 |
Characters and Unicode
| Total characters | 4377291 |
|---|---|
| Distinct characters | 8 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | right |
|---|---|
| 2nd row | right |
| 3rd row | right |
| 4th row | right |
| 5th row | right |
Common Values
| Value | Count | Frequency (%) |
| left | 517454 | |
| right | 461495 |
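The mean length and total character count reported for playDirection follow directly from these two counts. A minimal sketch (names are illustrative):

```python
# Value counts copied from the playDirection Common Values table.
counts = {"left": 517_454, "right": 461_495}

# Each row contributes len(value) characters.
total_chars = sum(len(value) * n for value, n in counts.items())
n_rows = sum(counts.values())
mean_length = total_chars / n_rows

print(total_chars, n_rows, round(mean_length, 9))
```

The median length of 4 is also consistent with these counts, since "left" accounts for more than half of the rows.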
Length
Category Frequency Plot
| Value | Count | Frequency (%) |
| left | 517454 | |
| right | 461495 |
Most occurring characters
| Value | Count | Frequency (%) |
| t | 978949 | |
| l | 517454 | |
| e | 517454 | |
| f | 517454 | |
| r | 461495 | |
| i | 461495 | |
| g | 461495 | |
| h | 461495 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 4377291 |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| t | 978949 | |
| l | 517454 | |
| e | 517454 | |
| f | 517454 | |
| r | 461495 | |
| i | 461495 | |
| g | 461495 | |
| h | 461495 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 4377291 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| t | 978949 | |
| l | 517454 | |
| e | 517454 | |
| f | 517454 | |
| r | 461495 | |
| i | 461495 | |
| g | 461495 | |
| h | 461495 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 4377291 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| t | 978949 | |
| l | 517454 | |
| e | 517454 | |
| f | 517454 | |
| r | 461495 | |
| i | 461495 | |
| g | 461495 | |
| h | 461495 |
x
Real number (ℝ≥0)
| Distinct | 11695 |
|---|---|
| Distinct (%) | 1.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 58.58592922 |
| Minimum | 0.47 |
|---|---|
| Maximum | 119.98 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.5 MiB |
Quantile statistics
| Minimum | 0.47 |
|---|---|
| 5-th percentile | 17.88 |
| Q1 | 38.81 |
| median | 58.84 |
| Q3 | 77.78 |
| 95-th percentile | 99.6 |
| Maximum | 119.98 |
| Range | 119.51 |
| Interquartile range (IQR) | 38.97 |
Descriptive statistics
| Standard deviation | 24.92508577 |
|---|---|
| Coefficient of variation (CV) | 0.4254449165 |
| Kurtosis | -0.8152433424 |
| Mean | 58.58592922 |
| Median Absolute Deviation (MAD) | 19.48 |
| Skewness | 0.0216178471 |
| Sum | 57352636.82 |
| Variance | 621.2599004 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 66.3 | 179 | < 0.1% |
| 66.34 | 176 | < 0.1% |
| 75.05 | 172 | < 0.1% |
| 68.67 | 171 | < 0.1% |
| 63.53 | 171 | < 0.1% |
| 69.19 | 171 | < 0.1% |
| 72.14 | 171 | < 0.1% |
| 68.69 | 171 | < 0.1% |
| 74.11 | 170 | < 0.1% |
| 33.66 | 170 | < 0.1% |
| Other values (11685) | 977227 |
| Value | Count | Frequency (%) |
| 0.47 | 1 | |
| 0.72 | 1 | |
| 0.78 | 1 | |
| 0.86 | 1 | |
| 0.95 | 1 | |
| 0.99 | 1 | |
| 1.06 | 1 | |
| 1.18 | 1 | |
| 1.25 | 1 | |
| 1.28 | 1 |
| Value | Count | Frequency (%) |
| 119.98 | 1 | |
| 119.97 | 2 | |
| 119.95 | 1 | |
| 119.94 | 1 | |
| 119.91 | 1 | |
| 119.89 | 1 | |
| 119.81 | 1 | |
| 119.69 | 1 | |
| 119.55 | 1 | |
| 119.51 | 1 |
y
Real number (ℝ)
| Distinct | 5333 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 26.71448044 |
| Minimum | -4.53 |
|---|---|
| Maximum | 53.63 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 65 |
| Negative (%) | < 0.1% |
| Memory size | 7.5 MiB |
Quantile statistics
| Minimum | -4.53 |
|---|---|
| 5-th percentile | 11.66 |
| Q1 | 21.88 |
| median | 26.68 |
| Q3 | 31.49 |
| 95-th percentile | 42.14 |
| Maximum | 53.63 |
| Range | 58.16 |
| Interquartile range (IQR) | 9.61 |
Descriptive statistics
| Standard deviation | 8.305238378 |
|---|---|
| Coefficient of variation (CV) | 0.3108890101 |
| Kurtosis | 0.3033364944 |
| Mean | 26.71448044 |
| Median Absolute Deviation (MAD) | 4.81 |
| Skewness | 0.02051996881 |
| Sum | 26152113.91 |
| Variance | 68.97698452 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 23.81 | 1167 | 0.1% |
| 23.78 | 1099 | 0.1% |
| 23.87 | 1044 | 0.1% |
| 29.76 | 1023 | 0.1% |
| 23.83 | 1018 | 0.1% |
| 23.86 | 1014 | 0.1% |
| 23.75 | 1000 | 0.1% |
| 23.79 | 990 | 0.1% |
| 23.76 | 986 | 0.1% |
| 23.77 | 982 | 0.1% |
| Other values (5323) | 968626 |
| Value | Count | Frequency (%) |
| -4.53 | 1 | |
| -4.21 | 1 | |
| -3.91 | 1 | |
| -3.56 | 1 | |
| -3.26 | 1 | |
| -3.18 | 1 | |
| -2.98 | 1 | |
| -2.86 | 1 | |
| -2.63 | 1 | |
| -2.61 | 1 |
| Value | Count | Frequency (%) |
| 53.63 | 1 | < 0.1% |
| 53.35 | 1 | < 0.1% |
| 53.33 | 1 | < 0.1% |
| 53.25 | 1 | < 0.1% |
| 53.23 | 2 | |
| 53.22 | 1 | < 0.1% |
| 53.21 | 1 | < 0.1% |
| 53.19 | 1 | < 0.1% |
| 53.18 | 2 | |
| 53.17 | 3 |
s
Real number (ℝ≥0)
| Distinct | 2173 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.588453495 |
| Minimum | 0 |
|---|---|
| Maximum | 28.16 |
| Zeros | 60657 |
| Zeros (%) | 6.2% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.5 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0.76 |
| median | 2.15 |
| Q3 | 3.82 |
| 95-th percentile | 6.76 |
| Maximum | 28.16 |
| Range | 28.16 |
| Interquartile range (IQR) | 3.06 |
Descriptive statistics
| Standard deviation | 2.393862993 |
|---|---|
| Coefficient of variation (CV) | 0.9248236438 |
| Kurtosis | 14.50736989 |
| Mean | 2.588453495 |
| Median Absolute Deviation (MAD) | 1.5 |
| Skewness | 2.372197137 |
| Sum | 2533963.96 |
| Variance | 5.730580028 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 60657 | 6.2% |
| 0.01 | 15215 | 1.6% |
| 0.02 | 8942 | 0.9% |
| 0.03 | 6575 | 0.7% |
| 0.04 | 5304 | 0.5% |
| 0.05 | 4702 | 0.5% |
| 0.06 | 4191 | 0.4% |
| 0.07 | 4002 | 0.4% |
| 0.08 | 3736 | 0.4% |
| 0.09 | 3478 | 0.4% |
| Other values (2163) | 862147 |
| Value | Count | Frequency (%) |
| 0 | 60657 | |
| 0.01 | 15215 | 1.6% |
| 0.02 | 8942 | 0.9% |
| 0.03 | 6575 | 0.7% |
| 0.04 | 5304 | 0.5% |
| 0.05 | 4702 | 0.5% |
| 0.06 | 4191 | 0.4% |
| 0.07 | 4002 | 0.4% |
| 0.08 | 3736 | 0.4% |
| 0.09 | 3478 | 0.4% |
| Value | Count | Frequency (%) |
| 28.16 | 1 | |
| 28.02 | 1 | |
| 27.71 | 1 | |
| 27.55 | 1 | |
| 27.34 | 1 | |
| 27.3 | 1 | |
| 27.26 | 1 | |
| 27.21 | 1 | |
| 27.01 | 1 | |
| 27 | 1 |
a
Real number (ℝ≥0)
| Distinct | 1518 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.80429337 |
| Minimum | 0 |
|---|---|
| Maximum | 33.43 |
| Zeros | 56618 |
| Zeros (%) | 5.8% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.5 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0.72 |
| median | 1.55 |
| Q3 | 2.6 |
| 95-th percentile | 4.49 |
| Maximum | 33.43 |
| Range | 33.43 |
| Interquartile range (IQR) | 1.88 |
Descriptive statistics
| Standard deviation | 1.442816763 |
|---|---|
| Coefficient of variation (CV) | 0.7996575206 |
| Kurtosis | 6.401140116 |
| Mean | 1.80429337 |
| Median Absolute Deviation (MAD) | 0.92 |
| Skewness | 1.409875764 |
| Sum | 1766311.19 |
| Variance | 2.08172021 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 56618 | 5.8% |
| 0.01 | 11996 | 1.2% |
| 0.02 | 6956 | 0.7% |
| 0.03 | 5202 | 0.5% |
| 0.04 | 3956 | 0.4% |
| 0.05 | 3360 | 0.3% |
| 1.34 | 3126 | 0.3% |
| 1.28 | 3114 | 0.3% |
| 1.05 | 3089 | 0.3% |
| 1.06 | 3073 | 0.3% |
| Other values (1508) | 878459 |
| Value | Count | Frequency (%) |
| 0 | 56618 | |
| 0.01 | 11996 | 1.2% |
| 0.02 | 6956 | 0.7% |
| 0.03 | 5202 | 0.5% |
| 0.04 | 3956 | 0.4% |
| 0.05 | 3360 | 0.3% |
| 0.06 | 2989 | 0.3% |
| 0.07 | 2759 | 0.3% |
| 0.08 | 2330 | 0.2% |
| 0.09 | 2306 | 0.2% |
| Value | Count | Frequency (%) |
| 33.43 | 1 | |
| 32.57 | 1 | |
| 30.98 | 1 | |
| 28.72 | 1 | |
| 26.58 | 1 | |
| 26.28 | 1 | |
| 25.61 | 1 | |
| 24.16 | 1 | |
| 23.78 | 1 | |
| 23.62 | 1 |
dis
Real number (ℝ≥0)
| Distinct | 535 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.261897249 |
| Minimum | 0 |
|---|---|
| Maximum | 10.45 |
| Zeros | 62790 |
| Zeros (%) | 6.4% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.5 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0.08 |
| median | 0.22 |
| Q3 | 0.38 |
| 95-th percentile | 0.68 |
| Maximum | 10.45 |
| Range | 10.45 |
| Interquartile range (IQR) | 0.3 |
Descriptive statistics
| Standard deviation | 0.2563842877 |
|---|---|
| Coefficient of variation (CV) | 0.9789499077 |
| Kurtosis | 51.57842879 |
| Mean | 0.261897249 |
| Median Absolute Deviation (MAD) | 0.15 |
| Skewness | 4.235698019 |
| Sum | 256384.05 |
| Variance | 0.06573290299 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 62790 | 6.4% |
| 0.01 | 53725 | 5.5% |
| 0.02 | 29770 | 3.0% |
| 0.03 | 22761 | 2.3% |
| 0.04 | 20019 | 2.0% |
| 0.05 | 18459 | 1.9% |
| 0.2 | 18137 | 1.9% |
| 0.21 | 17919 | 1.8% |
| 0.17 | 17857 | 1.8% |
| 0.19 | 17820 | 1.8% |
| Other values (525) | 699692 |
| Value | Count | Frequency (%) |
| 0 | 62790 | |
| 0.01 | 53725 | |
| 0.02 | 29770 | |
| 0.03 | 22761 | 2.3% |
| 0.04 | 20019 | 2.0% |
| 0.05 | 18459 | 1.9% |
| 0.06 | 17699 | 1.8% |
| 0.07 | 17235 | 1.8% |
| 0.08 | 17096 | 1.7% |
| 0.09 | 17069 | 1.7% |
| Value | Count | Frequency (%) |
| 10.45 | 1 | |
| 8.53 | 1 | |
| 6.67 | 2 | |
| 6.41 | 1 | |
| 6.23 | 1 | |
| 6.13 | 1 | |
| 6.12 | 1 | |
| 6.1 | 1 | |
| 6.08 | 2 | |
| 6.02 | 1 |
o
Real number (ℝ≥0)
| Distinct | 36001 |
|---|---|
| Distinct (%) | 3.8% |
| Missing | 42563 |
| Missing (%) | 4.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 180.1493396 |
| Minimum | 0 |
|---|---|
| Maximum | 360 |
| Zeros | 9 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.5 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 32.69 |
| Q1 | 90 |
| median | 179.68 |
| Q3 | 269.29 |
| 95-th percentile | 328.67 |
| Maximum | 360 |
| Range | 360 |
| Interquartile range (IQR) | 179.29 |
Descriptive statistics
| Standard deviation | 98.62727364 |
|---|---|
| Coefficient of variation (CV) | 0.5474750773 |
| Kurtosis | -1.368621125 |
| Mean | 180.1493396 |
| Median Absolute Deviation (MAD) | 89.68 |
| Skewness | 0.002297522931 |
| Sum | 168689319.5 |
| Variance | 9727.339106 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 90 | 1570 | 0.2% |
| 266.33 | 104 | < 0.1% |
| 90.59 | 95 | < 0.1% |
| 86.05 | 95 | < 0.1% |
| 91.34 | 94 | < 0.1% |
| 90.93 | 94 | < 0.1% |
| 88.81 | 92 | < 0.1% |
| 93.93 | 92 | < 0.1% |
| 271.66 | 92 | < 0.1% |
| 268.52 | 92 | < 0.1% |
| Other values (35991) | 933966 | |
| (Missing) | 42563 | 4.3% |
| Value | Count | Frequency (%) |
| 0 | 9 | |
| 0.01 | 11 | |
| 0.02 | 22 | |
| 0.03 | 11 | |
| 0.04 | 13 | |
| 0.05 | 16 | |
| 0.06 | 13 | |
| 0.07 | 22 | |
| 0.08 | 17 | |
| 0.09 | 17 |
| Value | Count | Frequency (%) |
| 360 | 10 | |
| 359.99 | 9 | |
| 359.98 | 18 | |
| 359.97 | 13 | |
| 359.96 | 15 | |
| 359.95 | 13 | |
| 359.94 | 13 | |
| 359.93 | 14 | |
| 359.92 | 8 | |
| 359.91 | 13 |
dir
Real number (ℝ≥0)
| Distinct | 36001 |
|---|---|
| Distinct (%) | 3.8% |
| Missing | 42563 |
| Missing (%) | 4.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 180.7110978 |
| Minimum | 0 |
|---|---|
| Maximum | 360 |
| Zeros | 8 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.5 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 24.79 |
| Q1 | 91.61 |
| median | 180.29 |
| Q3 | 270.36 |
| 95-th percentile | 336.32 |
| Maximum | 360 |
| Range | 360 |
| Interquartile range (IQR) | 178.75 |
Descriptive statistics
| Standard deviation | 100.5814224 |
|---|---|
| Coefficient of variation (CV) | 0.5565868594 |
| Kurtosis | -1.2830718 |
| Mean | 180.7110978 |
| Median Absolute Deviation (MAD) | 89.35 |
| Skewness | -0.0003477824361 |
| Sum | 169215342 |
| Variance | 10116.62253 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 266.6 | 70 | < 0.1% |
| 89.93 | 70 | < 0.1% |
| 261.1 | 69 | < 0.1% |
| 88.65 | 68 | < 0.1% |
| 88.13 | 67 | < 0.1% |
| 263.01 | 67 | < 0.1% |
| 267.77 | 67 | < 0.1% |
| 87.39 | 66 | < 0.1% |
| 96.66 | 66 | < 0.1% |
| 272.41 | 66 | < 0.1% |
| Other values (35991) | 935710 | |
| (Missing) | 42563 | 4.3% |
| Value | Count | Frequency (%) |
| 0 | 8 | < 0.1% |
| 0.01 | 25 | |
| 0.02 | 20 | |
| 0.03 | 11 | |
| 0.04 | 18 | |
| 0.05 | 16 | |
| 0.06 | 15 | |
| 0.07 | 20 | |
| 0.08 | 19 | |
| 0.09 | 24 |
| Value | Count | Frequency (%) |
| 360 | 6 | < 0.1% |
| 359.99 | 19 | |
| 359.98 | 23 | |
| 359.97 | 23 | |
| 359.96 | 22 | |
| 359.95 | 28 | |
| 359.94 | 15 | |
| 359.93 | 16 | |
| 359.92 | 19 | |
| 359.91 | 22 |
event
Categorical
| Distinct | 23 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.5 MiB |
| None | |
|---|---|
| ball_snap | 23713 |
| pass_forward | 21137 |
| autoevent_ballsnap | 10534 |
| autoevent_passforward | 10143 |
| Other values (18) | 11454 |
Length
| Max length | 25 |
|---|---|
| Median length | 4 |
| Mean length | 4.697930127 |
| Min length | 3 |
Characters and Unicode
| Total characters | 4599034 |
|---|---|
| Distinct characters | 25 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | None |
|---|---|
| 2nd row | None |
| 3rd row | None |
| 4th row | None |
| 5th row | None |
Common Values
| Value | Count | Frequency (%) |
| None | 901968 | |
| ball_snap | 23713 | 2.4% |
| pass_forward | 21137 | 2.2% |
| autoevent_ballsnap | 10534 | 1.1% |
| autoevent_passforward | 10143 | 1.0% |
| play_action | 5382 | 0.5% |
| qb_sack | 1265 | 0.1% |
| run | 1058 | 0.1% |
| pass_arrived | 989 | 0.1% |
| autoevent_passinterrupted | 483 | < 0.1% |
| Other values (13) | 2277 | 0.2% |
Length
| Value | Count | Frequency (%) |
| none | 901968 | |
| ball_snap | 23713 | 2.4% |
| pass_forward | 21137 | 2.2% |
| autoevent_ballsnap | 10534 | 1.1% |
| autoevent_passforward | 10143 | 1.0% |
| play_action | 5382 | 0.5% |
| qb_sack | 1265 | 0.1% |
| run | 1058 | 0.1% |
| pass_arrived | 989 | 0.1% |
| autoevent_passinterrupted | 483 | < 0.1% |
| Other values (13) | 2277 | 0.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| n | 965862 | |
| o | 961262 | |
| e | 948727 | |
| N | 901968 | |
| a | 168176 | 3.7% |
| s | 103822 | 2.3% |
| _ | 76222 | 1.7% |
| p | 74796 | 1.6% |
| l | 74681 | 1.6% |
| r | 67183 | 1.5% |
| Other values (15) | 256335 | 5.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 3620844 | |
| Uppercase Letter | 901968 | 19.6% |
| Connector Punctuation | 76222 | 1.7% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| n | 965862 | |
| o | 961262 | |
| e | 948727 | |
| a | 168176 | 4.6% |
| s | 103822 | 2.9% |
| p | 74796 | 2.1% |
| l | 74681 | 2.1% |
| r | 67183 | 1.9% |
| t | 51451 | 1.4% |
| b | 35880 | 1.0% |
| Other values (13) | 169004 | 4.7% |
Uppercase Letter
| Value | Count | Frequency (%) |
| N | 901968 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 76222 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 4522812 | |
| Common | 76222 | 1.7% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| n | 965862 | 21.4% |
| o | 961262 | 21.3% |
| e | 948727 | 21.0% |
| N | 901968 | 19.9% |
| a | 168176 | 3.7% |
| s | 103822 | 2.3% |
| p | 74796 | 1.7% |
| l | 74681 | 1.7% |
| r | 67183 | 1.5% |
| t | 51451 | 1.1% |
| Other values (14) | 204884 | 4.5% |
Common
| Value | Count | Frequency (%) |
| _ | 76222 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 4599034 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| n | 965862 | 21.0% |
| o | 961262 | 20.9% |
| e | 948727 | 20.6% |
| N | 901968 | 19.6% |
| a | 168176 | 3.7% |
| s | 103822 | 2.3% |
| _ | 76222 | 1.7% |
| p | 74796 | 1.6% |
| l | 74681 | 1.6% |
| r | 67183 | 1.5% |
| Other values (15) | 256335 | 5.6% |
Auto
The auto setting is an easily interpretable pairwise column metric that selects a method from the types of the two columns: categorical-categorical: Cramér's V; numerical-categorical: Cramér's V (using a discretized numerical column); numerical-numerical: Spearman's ρ. This configuration uses the most suitable method for each pair of columns.
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.
To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
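As a minimal sketch of that calculation (not the code the report itself runs), the snippet below ranks both variables and then divides the covariance of the ranks by the product of their standard deviations. The helper names `average_ranks`, `pearson` and `spearman` and the toy data are assumptions made for illustration; ties receive the average of their rank positions.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def spearman(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy data: y = x**3 is monotonic but not linear.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

rho = spearman(x, y)
r = pearson(x, y)
print(f"Spearman rho = {rho:.4f}, Pearson r = {r:.4f}")
```

Because the relationship is perfectly monotonic, ρ reaches its maximum while r stays below it, which is the advantage over Pearson's r described above.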
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.
To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
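The formula and the invariance property can be checked with a short sketch. The `s` and `dis` values are copied from rows 4 to 9 of the First rows table in this report. The helper `pearson_r` and the chosen shift and scale factors are assumptions for illustration, not the report's own implementation.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


# Speed (s) and distance travelled (dis) for rows 4 to 9 of the First rows table.
s = [0.00, 0.01, 0.08, 0.44, 0.80, 1.22]
dis = [0.01, 0.00, 0.01, 0.04, 0.07, 0.11]

r = pearson_r(s, dis)

# Shift and rescale each variable separately (positive scale factors, chosen arbitrarily).
s_shifted = [2.0 * v + 3.0 for v in s]
dis_shifted = [0.5 * v - 1.0 for v in dis]
r_shifted = pearson_r(s_shifted, dis_shifted)

print(f"r = {r:.4f}, r after shift/scale = {r_shifted:.4f}")
```

On this small excerpt r is close to +1, in line with the high correlation between s and dis flagged in the report overview, and it is unchanged by the shift and rescaling up to floating-point error.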
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.
To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
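The pair-counting definition can be sketched directly. This is the plain form described above, often called τ-a; library implementations commonly add a correction for ties, so their results may differ on tied data. The function name `kendall_tau_a` and the toy data are assumptions for illustration.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # A pair is concordant when x and y move in the same direction.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # direction == 0 means a tie in x or y; it counts toward neither group.
    return (concordant - discordant) / len(pairs)


# Hypothetical toy data: two adjacent swaps in an otherwise increasing sequence.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

tau = kendall_tau_a(x, y)
print(f"Kendall tau-a = {tau:.2f}")
```

Of the 10 possible pairs, 8 are concordant and 2 are discordant, so τ = (8 - 2) / 10 = 0.6.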
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available.
First rows
| | gameId | playId | nflId | frameId | time | jerseyNumber | team | playDirection | x | y | s | a | dis | o | dir | event |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2021102800 | 189 | 37077.0 | 1 | 2021-10-29T00:27:23.000 | 18.0 | ARI | right | 21.38 | 6.94 | 0.00 | 0.00 | 0.00 | 43.74 | 223.19 | None |
| 1 | 2021102800 | 189 | 37077.0 | 2 | 2021-10-29T00:27:23.100 | 18.0 | ARI | right | 21.38 | 6.94 | 0.00 | 0.00 | 0.00 | 44.67 | 243.81 | None |
| 2 | 2021102800 | 189 | 37077.0 | 3 | 2021-10-29T00:27:23.200 | 18.0 | ARI | right | 21.38 | 6.95 | 0.00 | 0.00 | 0.00 | 45.69 | 303.24 | None |
| 3 | 2021102800 | 189 | 37077.0 | 4 | 2021-10-29T00:27:23.300 | 18.0 | ARI | right | 21.38 | 6.94 | 0.00 | 0.00 | 0.00 | 46.44 | 285.89 | None |
| 4 | 2021102800 | 189 | 37077.0 | 5 | 2021-10-29T00:27:23.400 | 18.0 | ARI | right | 21.38 | 6.95 | 0.00 | 0.00 | 0.01 | 47.99 | 341.60 | None |
| 5 | 2021102800 | 189 | 37077.0 | 6 | 2021-10-29T00:27:23.500 | 18.0 | ARI | right | 21.39 | 6.94 | 0.01 | 0.20 | 0.00 | 50.85 | 80.97 | ball_snap |
| 6 | 2021102800 | 189 | 37077.0 | 7 | 2021-10-29T00:27:23.600 | 18.0 | ARI | right | 21.40 | 6.95 | 0.08 | 0.70 | 0.01 | 52.49 | 77.83 | None |
| 7 | 2021102800 | 189 | 37077.0 | 8 | 2021-10-29T00:27:23.700 | 18.0 | ARI | right | 21.44 | 6.95 | 0.44 | 2.46 | 0.04 | 51.82 | 82.23 | None |
| 8 | 2021102800 | 189 | 37077.0 | 9 | 2021-10-29T00:27:23.800 | 18.0 | ARI | right | 21.51 | 6.96 | 0.80 | 3.01 | 0.07 | 53.37 | 82.37 | None |
| 9 | 2021102800 | 189 | 37077.0 | 10 | 2021-10-29T00:27:23.900 | 18.0 | ARI | right | 21.62 | 6.98 | 1.22 | 3.31 | 0.11 | 56.26 | 81.58 | None |
Last rows
| | gameId | playId | nflId | frameId | time | jerseyNumber | team | playDirection | x | y | s | a | dis | o | dir | event |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 978939 | 2021110100 | 4433 | NaN | 49 | 2021-11-02T03:20:26.000 | NaN | football | right | 23.39 | 26.84 | 2.73 | 0.88 | 0.28 | NaN | NaN | None |
| 978940 | 2021110100 | 4433 | NaN | 50 | 2021-11-02T03:20:26.100 | NaN | football | right | 23.50 | 27.10 | 2.81 | 0.82 | 0.28 | NaN | NaN | None |
| 978941 | 2021110100 | 4433 | NaN | 51 | 2021-11-02T03:20:26.200 | NaN | football | right | 23.64 | 27.33 | 2.72 | 1.59 | 0.27 | NaN | NaN | None |
| 978942 | 2021110100 | 4433 | NaN | 52 | 2021-11-02T03:20:26.300 | NaN | football | right | 23.80 | 27.54 | 2.63 | 2.15 | 0.26 | NaN | NaN | None |
| 978943 | 2021110100 | 4433 | NaN | 53 | 2021-11-02T03:20:26.400 | NaN | football | right | 23.99 | 27.72 | 2.57 | 2.27 | 0.26 | NaN | NaN | qb_strip_sack |
| 978944 | 2021110100 | 4433 | NaN | 54 | 2021-11-02T03:20:26.500 | NaN | football | right | 24.17 | 27.89 | 2.47 | 2.28 | 0.25 | NaN | NaN | None |
| 978945 | 2021110100 | 4433 | NaN | 55 | 2021-11-02T03:20:26.600 | NaN | football | right | 24.36 | 28.03 | 2.36 | 2.16 | 0.24 | NaN | NaN | None |
| 978946 | 2021110100 | 4433 | NaN | 56 | 2021-11-02T03:20:26.700 | NaN | football | right | 24.55 | 28.17 | 2.25 | 1.45 | 0.23 | NaN | NaN | None |
| 978947 | 2021110100 | 4433 | NaN | 57 | 2021-11-02T03:20:26.800 | NaN | football | right | 24.73 | 28.31 | 2.28 | 0.72 | 0.23 | NaN | NaN | None |
| 978948 | 2021110100 | 4433 | NaN | 58 | 2021-11-02T03:20:26.900 | NaN | football | right | 24.91 | 28.45 | 2.32 | 0.54 | 0.23 | NaN | NaN | None |